@tausbn tausbn commented Jan 7, 2026

WIP

@github-actions github-actions bot added the Python label Jan 7, 2026
@tausbn tausbn force-pushed the tausbn/python-add-dataflow-overlay-annotations branch from e203d8b to 97e2376 Compare January 9, 2026 16:27
tausbn added 15 commits January 30, 2026 12:50
Removes the dependence on the (global) `ModuleVariableNode.getARead()`,
by adding a local version (that doesn't include `import *` reads)
instead.
This may result in more nodes, but it should still be bounded by the
number of global variables in the source code.
With `ModuleVariableNode`s now appearing for _all_ global variables (not
just the ones that actually seem to be used), some of the tests changed
a bit. Mostly this was in the form of new flow (because of new nodes
that popped into existence). For some inline expectation tests, I opted
to instead exclude these results, as there was no suitable location to
annotate. For the normal tests, I just accepted the output (after having
vetted it carefully, of course).
Explicitly adds a bunch of nodes that were previously (using a global
analysis) identified as `ExtractedArgumentNode`s. These are
subsequently filtered out in `argumentOf` (which is global) by putting
the call to `getCallArg` there instead of in the charpred.
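
The shape of this change can be modelled outside QL. What follows is a minimal Python sketch, not the actual CodeQL code: `Node`, `Call`, `local_candidates`, and the Python versions of `get_call_arg` and `argument_of` are hypothetical stand-ins. It assumes the candidate set can be computed from a single file (local), while deciding which candidates are real arguments needs whole-program information (global).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    file: str
    name: str


@dataclass(frozen=True)
class Call:
    callee: str   # name of the function being called
    args: tuple   # argument nodes in positional order


def local_candidates(nodes_in_file):
    """Local step: accept every node in the file as a candidate.

    This overapproximates on purpose and needs nothing from other files.
    """
    return set(nodes_in_file)


def get_call_arg(calls, resolvable):
    """Global step (stand-in): yield (call, position, node) only for calls
    whose callee can be resolved somewhere in the whole program."""
    for call in calls:
        if call.callee in resolvable:
            for pos, node in enumerate(call.args):
                yield call, pos, node


def argument_of(candidates, calls, resolvable):
    """The global restriction lives here, not in the candidate definition."""
    return {
        (node, call, pos)
        for call, pos, node in get_call_arg(calls, resolvable)
        if node in candidates
    }


# Usage: three candidates, but only one is an argument of a resolvable call.
a, b, c = Node("m.py", "a"), Node("m.py", "b"), Node("m.py", "c")
calls = [Call("f", (a,)), Call("unknown", (b,))]
candidates = local_candidates([a, b, c])
real_arguments = argument_of(candidates, calls, resolvable={"f"})
```

In this model the candidate set never changes when other files change; only `argument_of` has to be recomputed.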
Fixes the test failures that arose from making `ExtractedArgumentNode`
local.

For the consistency checks, we now explicitly exclude the
`ExtractedArgumentNode`s (now much more plentiful due to the
overapproximation) that don't have a corresponding `getCallArg` tuple.

For various queries/tests using `instanceof ArgumentNode`, we instead use
`isArgumentNode`, which explicitly filters out the ones for which
`isArgumentOf` doesn't hold (which, again, is the case for most of the
nodes in the overapproximation).
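
To illustrate why a plain type test now returns too much, here is a small Python model. It is a sketch under assumptions, not the real `isArgumentNode` definition (which is not shown here): `candidate_nodes`, `argument_tuples`, and `is_argument_node` are hypothetical names, and nodes and calls are represented as plain strings.

```python
# Hypothetical overapproximated class: every node that *might* be an argument.
candidate_nodes = {"x", "y", "z"}

# Hypothetical (node, call, position) tuples for which the argument
# relation actually holds.
argument_tuples = {("x", "f()", 0), ("z", "g()", 1)}


def is_argument_node(node):
    """True only if the node is a candidate AND some argument tuple exists."""
    return node in candidate_nodes and any(
        n == node for n, _call, _pos in argument_tuples
    )


# Analogous to `instanceof ArgumentNode`: selects every candidate.
by_type = sorted(candidate_nodes)

# Analogous to the filtered helper: keeps only genuine arguments.
by_predicate = sorted(n for n in candidate_nodes if is_argument_node(n))
```

Here `by_type` includes `"y"`, which is not an argument of any call, while `by_predicate` drops it.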
Uses the same trick as for `ExtractedArgumentNode`, wherein we postpone
the global restriction on the charpred to instead be in the `argumentOf`
predicate (which is global anyway).

In addition to this, we also converted `CapturedVariablesArgumentNode`
into a proper synthetic node, and added an explicit post-update node for
it. These nodes just act as wrappers for the function part of call
nodes. Thus, to make them work with the variable capture machinery, we
simply map them to the closure node for the corresponding control-flow
or post-update node.
As we now have many more capturing closure arguments, we must once again
exclude the ones that don't actually have `argumentOf` defined.
New nodes mean new results. Luckily we rarely have a test that selects
_all_ dataflow nodes.
... and everything else that it depends on.
None of these required any changes to the dataflow libraries, so it
seemed easiest to put them in their own commit.
@tausbn tausbn force-pushed the tausbn/python-add-dataflow-overlay-annotations branch from 97e2376 to 9e43da9 Compare January 30, 2026 13:52